Abstract

The objective of this paper is to estimate smartphone location in support
of services that demand lane-level precision, such as high-occupancy
vehicle (HOV) lane Estimated Time of Arrival (ETA) estimation. We focus on
developing a model based on raw location measurements collected under open
sky and on light urban roads, using datasets collected by hosts from
Android smartphones. The use of mobile devices in software products built
for services such as cadastral surveying, mapping, and navigation has been
increasing due to the cost-effectiveness of GNSS smartphones. This paper
aims to bridge the gap between the geospatial information of detailed human
behavior and the smartphone internet with improved granularity. It
addresses the loss of accuracy that the GNSS/INS integrated navigation
system suffers during a GNSS signal outage. We aim to improve the currently
used GNSS/INS integration algorithm with an AI-based approach. The position
of a vehicle during a GNSS loss can be predicted using a GNSS/INS
integration methodology for land vehicle navigation based on position
update architecture (PUA) employing LightGBM regression, which models the
connection between INS data and changes in vehicle location.
1. Introduction

The most widely used technologies in transportation systems are navigation
and positioning (Everett et al.). Their techniques are designed to employ
satellite data and vehicle dynamics data to compute the precise present
location of a vehicle. The robustness of the algorithm is a crucial factor
in determining the precision of the vehicle navigation system and its
ability to adjust to the surrounding environment (Zangenehnejad and Gao).

GNSS/INS navigation is an affordable, precise, and flexible navigation
solution that provides real-time updates on a vehicle’s location, mainly
through satellite signals (Zhong et al.). First, the inertial navigation
system collects data on angular and linear motion in relation to inertial
space. The system then employs inertial navigation differential equations
to determine changes in vehicle speed and location (Chiang et al.).

Since Android Nougat (2016), users have had access to raw GNSS
measurements. Hence, GNSS smartphone positioning has been a focus of
considerable interest and study in recent years. Because of their low
price, GNSS smartphones may be used for a wide variety of tasks, from
cadastral surveys and mapping to surveying and vehicle navigation, as well
as for directing pedestrians (O Zhilinsky).

Despite the increasing use of GNSS devices for a broad range of
applications, including cadastral surveying, mapping, and navigation, users
still face a number of challenges that prevent them from achieving
high-accuracy positioning (Fu, Khider, and Van Diggelen). These include the
noise present in GNSS measurements, the direct effect of the environment,
changes in the way a smartphone is held, and constraints on algorithm
design. In spite of these obstacles, the use of GNSS apps on smartphones
has been steadily growing in popularity because they are more affordable
(Shen et al.).

Satellites are essential to the operation of the GNSS navigation system.
GNSS satellites orbit the earth and broadcast signals that allow a receiver
to determine how far away it is from each individual satellite. Because
their orbits have been mapped out, it is possible to determine where the
satellites are located, and from these known positions and measured
distances the receiver’s own position can be computed (Hegarty).


2. Materials and Methods 
2.1. Dataset 


The dataset chosen is Google Inc.’s Android Raw GNSS Measurement Datasets
for Precise Positioning. The files contain raw GNSS measurements and
inertial sensor readings obtained with a range of dual-frequency and ADR
(Accumulated Delta Range) techniques. To collect the data, smartphones with
ADR capabilities, including the Xiaomi Mi 8 and Google Pixel 4, were used
in the San Francisco Bay Area in the United States. The Global Navigation
Satellite System (GNSS) is an umbrella term for satellite-based positioning
systems such as GPS, the Quasi-Zenith Satellite System (QZSS), GLONASS, and
Galileo. GNSS surveying is highly accurate because it uses radio waves
emitted from the GNSS satellites that orbit the earth to determine
coordinates. The receiver at the station can pick up these radio waves from
the sky, allowing surveying to be performed regardless of weather
conditions. GNSS surveying is currently the most popular form of geodetic
surveying since it can provide three-dimensional, high-precision results
and improve the efficiency of surveying tasks (Tiberius and Verbree).

The coordinates collected by GNSS devices such as car navigation systems
and smartphones are usually expressed in the WGS 84 coordinate system. Both
the WGS 84 coordinate system and the ITRF coordinate system are
earth-centered systems. Although WGS 84 has been revised numerous times, it
can still be considered equivalent to the ITRF system, with no significant
practical difference.
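
To make the notion of an earth-centered system concrete, the sketch below converts a WGS 84 latitude, longitude, and ellipsoidal height into earth-centered, earth-fixed (ECEF) Cartesian coordinates using the standard closed-form conversion. It is only an illustration and not part of the processing pipeline described in this paper; the function name geodetic_to_ecef is our own.

```python
import math

# WGS 84 ellipsoid constants (defining parameters of the datum).
WGS84_A = 6378137.0                    # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert WGS 84 latitude/longitude/height to ECEF X, Y, Z in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z
```

A point on the equator at the prime meridian with zero height maps to X equal to the semi-major axis, with Y and Z equal to zero.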


2.2. Data Cleaning 


Information received from satellites is often accompanied by unwanted
noise. Data cleaning is the process of correcting or deleting invalid,
distorted, wrongly formatted, redundant, or incomplete data within a
dataset. Because multiple sources of data are often combined, there is a
high chance of duplication or misidentification (Wielgosz et al.).
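
A minimal sketch of such a cleaning pass is shown below. It is an assumption about what the step could look like, not the procedure actually applied to the dataset: the field names utc_ms, lat_deg, and lon_deg and the sample rows are hypothetical, and the code is kept dependency-free for clarity.

```python
import math


def clean_records(records, required=("utc_ms", "lat_deg", "lon_deg")):
    """Drop incomplete, non-finite, out-of-range, and duplicate position rows."""
    seen = set()
    cleaned = []
    for row in records:
        values = [row.get(key) for key in required]
        # Incomplete rows: a required field is missing or None.
        if any(v is None for v in values):
            continue
        # Distorted rows: NaN or infinite values.
        if not all(math.isfinite(v) for v in values):
            continue
        # Invalid rows: coordinates outside the valid geographic range.
        if not (-90.0 <= row["lat_deg"] <= 90.0
                and -180.0 <= row["lon_deg"] <= 180.0):
            continue
        # Redundant rows: the same epoch and position were already kept.
        key = tuple(values)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Made-up sample rows covering each kind of defect.
raw = [
    {"utc_ms": 1000, "lat_deg": 37.42, "lon_deg": -122.08},
    {"utc_ms": 1000, "lat_deg": 37.42, "lon_deg": -122.08},         # duplicate
    {"utc_ms": 2000, "lat_deg": float("nan"), "lon_deg": -122.08},  # distorted
    {"utc_ms": 3000, "lat_deg": 37.43},                             # incomplete
    {"utc_ms": 4000, "lat_deg": 137.0, "lon_deg": -122.08},         # invalid
    {"utc_ms": 5000, "lat_deg": 37.44, "lon_deg": -122.07},
]
cleaned = clean_records(raw)
```

Only the first and last sample rows survive the pass.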


2.3. Exploratory Data Analysis (EDA) 


Visual techniques are used to break down the data and to find trends and
patterns or check hypotheses, with the support of numerical summaries and
graphical illustrations. Exact positioning data is essential to make
smartphone tracking possible; however, satellite positioning alone is
presently inadequate. Using an inertial measurement unit (IMU) can offer
decimeter-level positioning, bringing us nearer to the implementation of
automated driving. The RINEX observation files or Google’s GnssLogger
records (also referred to as GnssLog) must be processed to produce position
solutions as NMEA records and, finally, to generate result metrics.
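
As an illustration of working with NMEA records, the sketch below extracts latitude and longitude from a GGA sentence and converts the NMEA degrees-and-decimal-minutes encoding into signed decimal degrees. The function names are our own, the sentence used in the usage note is a generic textbook-style example rather than data from this study, and checksum validation is deliberately left out.

```python
def nmea_to_decimal(value, hemisphere):
    """Convert an NMEA (d)ddmm.mmmm coordinate string to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)        # leading digits are whole degrees
    minutes = raw - degrees * 100    # remaining digits are decimal minutes
    decimal = degrees + minutes / 60.0
    # South latitudes and west longitudes are negative by convention.
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga_position(sentence):
    """Extract (latitude, longitude) in decimal degrees from a GGA sentence."""
    # Drop the trailing "*hh" checksum, then split the comma-separated fields.
    fields = sentence.split("*")[0].split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    lat = nmea_to_decimal(fields[2], fields[3])
    lon = nmea_to_decimal(fields[4], fields[5])
    return lat, lon
```

For example, parse_gga_position("$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47") yields a latitude of about 48.1173 degrees and a longitude of about 11.5167 degrees.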


2.4. Data Visualization 


Folium, a powerful Python library, can be used to create different kinds of
Leaflet maps that open in a separate HTML file. Folium maps are interactive
and can also be rendered inline in Jupyter notebooks. Matplotlib, a
comprehensive Python library, can be used to produce static, animated, and
interactive visualizations, and also to plot smartphone tracks. With
Matplotlib, straightforward tasks are simple to complete and more
complicated tasks are still possible (Adavi and Nisha).


2.5. Model Building 


Building a model requires collecting and understanding the data and
choosing a statistical, mathematical, or simulation model to answer
questions and make predictions. We employ a tree-based learning algorithm
with gradient boosting known as LightGBM. This type of algorithm is a
supervised learning method that addresses classification and regression by
building tree-like structures. LightGBM is a distributed, efficient
gradient boosting framework with the following advantages: rapid training
speed, low memory consumption, greater accuracy, support for parallel,
distributed, and GPU learning, and the capacity to work with huge datasets.


FIGURE 1. Flowchart for working of designed algorithm


FIGURE 2. GNSS Satellite


2.6. Working of LightGBM algorithm 


[Figure 3 block labels: GNSS raw measurement; Data cleaning; Decision tree,
boosting; Precise Positioning; Exclusive Feature Bundling]


FIGURE 3. Proposed Design of LightGBM Model


The AutoML tool’s Train feature utilizes the decision tree-based LightGBM,
which is a gradient boosting ensemble technique. Its purpose is similar to
that of other decision tree-based methods, with the added goal of
optimizing performance in distributed systems. LightGBM grows trees
leaf-wise: at each step, only the leaf whose split yields the largest gain
is split. If the dataset is small, this can lead to overfitting, which can
be prevented by limiting the tree depth. LightGBM also uses a
histogram-based technique that divides feature values into bins according
to the histogram of their distribution, instead of iterating over every
value to calculate the gain and split the data. A sparse dataset can also
benefit from this optimization. In addition, LightGBM has exclusive feature
bundling (EFB), which groups together exclusive features (features that
rarely take non-zero values at the same time) to reduce the number of
dimensions and speed up processing. This optimization can be advantageous
even if the dataset is small.
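
The histogram idea can be sketched as follows. This is a simplified illustration under our own assumptions, not LightGBM’s actual implementation: it uses equal-width bins, a squared-error gain with unit hessians, and a hypothetical function name. The point it shows is that split candidates are evaluated only at bin boundaries rather than at every distinct feature value.

```python
def best_histogram_split(values, gradients, n_bins=4):
    """Find the best split threshold by scanning bin boundaries only."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
    # Accumulate gradient sums and sample counts per equal-width bin.
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_g, total_n = sum(grad_sum), sum(count)
    best_gain, best_threshold = 0.0, None
    left_g, left_n = 0.0, 0
    # Only n_bins - 1 candidate thresholds are evaluated.
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Gain of the split for squared-error loss (unit hessians).
        gain = (left_g ** 2 / left_n + right_g ** 2 / right_n
                - total_g ** 2 / total_n)
        if gain > best_gain:
            best_gain, best_threshold = gain, lo + (b + 1) * width
    return best_threshold, best_gain
```

With two well-separated clusters of feature values carrying gradients of opposite sign, the returned threshold falls between the clusters.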

When employing the LightGBM algorithm, gradient-based one-side sampling
(GOSS) is applied to the dataset. GOSS weighs data points with larger
gradients more heavily when determining the gain. This approach places
higher importance on instances that are not yet well fitted by the model.
To maintain precision, the data points with large gradients are all
preserved, while those with small gradients are sampled at random and the
remainder are removed from the evaluation.
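
The sampling step can be sketched as below, following the usual description of GOSS: keep the top fraction of instances by absolute gradient, draw a random fraction of the rest, and up-weight the drawn instances so that the gain estimate stays representative. The function name, the default rates, and the fixed seed are illustrative choices of ours, not settings used in this paper.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by gradient-based one-side sampling."""
    n = len(gradients)
    # Rank instances from largest to smallest absolute gradient.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    top = order[:n_top]                        # always keep large gradients
    rng = random.Random(seed)
    rest = rng.sample(order[n_top:], n_other)  # random subset of the others

    # Up-weight the sampled small-gradient points so that the gain estimate
    # still reflects the original data distribution.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + rest
    weights = [1.0] * len(top) + [amplify] * len(rest)
    return indices, weights
```

With 100 instances and the default rates, 20 large-gradient instances are kept with weight 1 and 10 of the remaining 80 are kept with weight 8.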


3. Design 


FIGURE 4. Deployment Architecture 


The Android raw GNSS measurement dataset is divided into training and
testing sets. The training set is used to train the machine learning model,
while the test set is used to evaluate it. The user interface of the
application is designed using HTML together with CSS and JavaScript. HTML
(HyperText Markup Language) is the standard markup language used to create
web pages; it defines their structure and comprises a set of elements that
instruct the browser on how to present the content.

This HTML page is served with Flask, a popular web framework written in
Python that facilitates web page development. Flask helps with scalability
and flexibility and is lightweight and easy to work with, allowing
developers to concentrate on writing better code. The application is then
made remotely accessible from other laptops using Python ngrok tunnels.
These tunnels provide a secure way to expose localhost applications to
remote systems, giving multiple laptops instant access to the application
under test.


4. Comparative Analysis


The trained LightGBM algorithm can accurately forecast the performance in a
short amount of time compared to the conventional procedure of “modeling,
setting parameters, building performance simulation,” considerably reducing
labor and time costs. When compared to other algorithms such as Decision
Tree, KNN, and Random Forest, the LightGBM algorithm’s classification
prediction performance is the best. In prior research publications that use
the XGBoost algorithm, categorical features may introduce bias if they are
encoded as numbers, because they are then treated as ordered numerical
features. Because of this, one-hot encoding must be used before feeding the
data into XGBoost. LightGBM is therefore used here, since it accepts
categorical input directly.
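
The one-hot encoding step mentioned above can be sketched as follows. The feature used here (the phone model that logged each sample) is a hypothetical example; the sketch only shows why the encoding removes the artificial ordering that integer codes would impose.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector, one column per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one indicator is set per sample
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature: the phone model that logged each sample.
phones = ["Pixel4", "Mi8", "Pixel4"]
categories, encoded = one_hot_encode(phones)
```

Encoding the phone models as integers (for example Mi8 = 0 and Pixel4 = 1) would suggest that one model ranks above the other, whereas the indicator columns treat the categories symmetrically.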

The PUA (position update architecture) LightGBM model of the research
publication (Li et al.) can be further improved by integrating SLAM
(simultaneous localisation and mapping) with it, which can be useful in the
event of a GNSS outage. The link between INS data and changes in vehicle
position is modeled using LightGBM. Multi-sensor fusion, which ensures that
all measurements are accurate for the extended Kalman filter (EKF) update
process, is a vital component in the development of autonomous driving
systems, and it can be enabled by using SLAM.
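
As a rough illustration of the position-update idea, the sketch below propagates the last GNSS fix through an outage by accumulating position increments predicted from successive windows of INS data. Everything here is an assumption made for the example: the local north/east frame, the function names, and especially toy_model, which is a constant-velocity stand-in for a trained LightGBM regressor and not the model evaluated in this paper.

```python
def bridge_outage(last_fix, ins_windows, predict_increment):
    """Propagate position through a GNSS outage using predicted increments."""
    north, east = last_fix
    track = []
    for features in ins_windows:
        # The model maps one window of INS features to a position change.
        d_north, d_east = predict_increment(features)
        north += d_north
        east += d_east
        track.append((north, east))
    return track


def toy_model(features):
    """Stand-in for a trained regressor: constant-velocity increment."""
    v_north, v_east, dt = features
    return v_north * dt, v_east * dt


# Three one-second INS windows at 10 m/s north and 0.5 m/s east (made-up).
windows = [(10.0, 0.5, 1.0)] * 3
track = bridge_outage((0.0, 0.0), windows, toy_model)
```

In this toy run the track ends 30 m north and 1.5 m east of the last fix. Because each position is built on the previous one, any prediction error accumulates over the outage, which is one reason for complementing the approach with additional sensors.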


5. GNSS Smartphone Positioning Challenges 
and Future References 


Although much effort is being put into smartphone positioning, GNSS
smartphone positioning is still in its infancy. The biggest influence on
GNSS accuracy is atmospheric interference, which occurs as signals travel
through space and enter the earth’s atmosphere. Since GNSS smartphones
employ GNSS chipsets and antennas suited to cell phones, the observations
are quite noisy. Most contemporary GNSS receivers have fairly good accuracy
because they can observe many satellites. Occasionally, however, satellite
signals suffer from multipath because they reflect off structures such as
buildings. If the device receives both reflected signals and direct
line-of-sight signals, the positioning might be less precise. Additionally,
it is challenging to discriminate between direct line-of-sight (LOS)
signals and non-line-of-sight (NLOS) signals, the latter of which introduce
a substantial multipath error into GNSS readings.

The goal of this paper is to give readers a better understanding of global
navigation satellite systems and how they might be applied to smartphone
positioning. Here, we employ the gradient boosting algorithm LightGBM,
which has a 96.73% accuracy rate and can be helpful in GNSS research work,
to match smartphone GNSS measurements with the ground truths supplied by
NovAtel Span ISC 100C receivers. This promises to increase location
accuracy and to open up new consumer and business opportunities.

The global market for GNSS smartphones is expanding quickly, which presents
enormous opportunities for both the academic and industrial sectors. In
addition to studies emphasizing high accuracy, industrial companies are
also interested in positioning and in employing mass-market products in
this field. Industry insiders predict that, in the future, high-accuracy
applications will be broadly adaptable to mass-market devices.


6. Results and Discussion 


The model was trained with the LightGBM classifier at a learning rate of
0.1. The limited user base of LightGBM is one drawback; however, this is
quickly changing. Although it is faster and more accurate than XGBoost,
this technique has not been widely used because less documentation is
available. Nevertheless, compared to other boosting methods, it has shown
noticeably superior outcomes.

Instead of growing the tree level by level like the Extreme Gradient
Boosting Machine, the Light Gradient Boosting Machine grows it leaf by
leaf. Gradient-based one-side sampling and exclusive feature bundling are
the two key methods that LightGBM adds to gradient boosting. Compared to
other gradient boosting models, LightGBM runs more quickly and requires
less memory.


7. Conclusion 


Smartphone users will benefit from services with lane-level precision, an
improved experience with location-based gaming, and greater detail on road
safety issues. Android’s access to raw GNSS data opens up the potential for
new GNSS smartphone applications with an accuracy and validity that were
not feasible before. Because GNSS smartphones are affordable, they can be
used in many different applications such as mapping surveys, pedestrian
navigation, car navigation, and more.


TABLE 1. Accuracy score

Type                             Percent
LightGBM Model Accuracy Score    96.73 %
Training Set Accuracy Score      96.64 %
Test Set Accuracy Score          96.73 %